WO 2005/099261



PCT/US2004/006796 



MULTI-STAGE MEDIA COMPRESSION TECHNIQUE FOR POWER AND 

STORAGE EFFICIENCY 

BACKGROUND OF THE INVENTION 
FIELD OF THE INVENTION

The present invention generally relates to multimedia and, more particularly, to 
a multi-stage media compression method and apparatus for mobile and other 
devices. The multi-stage media compression method and apparatus provide power 
and storage efficiency for the mobile and other devices.

BACKGROUND OF THE INVENTION 

Current mobile devices, such as cell phones and Personal Digital Assistants, 
have very strict power requirements in order to maximize battery life. Therefore, the 
mobile devices are designed with CPUs that have low power and, consequently, low
processing power.

A future application that is desired for these devices is to be able to capture 
video with an embedded camera and encode (compress) the video data for efficient 
transmission through cellular networks. However, there are many difficult design 
issues for such a system. For example, the available network bandwidth in cellular
networks is extremely limited and expensive. Therefore, a very high compression
ratio is desired. Moreover, typical CPUs (even high end CPUs for PDAs) are not 
capable of performing real-time encoding of video at high compression ratios. The 
required CPU Million Instructions Per Second (MIPS) is usually at least 5 times what is
available. Further, the amount of available memory for storage of uncompressed
video is very limited. For a typical PDA with 64 Mbytes of RAM, only 20 seconds of
320x240 @30fps video can be stored uncompressed.
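
By way of a non-limiting illustration, the 20 second figure can be reproduced with a short calculation. The sketch below assumes 12 bits per pixel (YUV 4:2:0 sampling) and assumes that the entire 64 Mbytes is available for video; neither assumption is stated in the text, and the function name is hypothetical.

```python
def uncompressed_seconds(ram_bytes, width, height, fps, bits_per_pixel=12):
    """Seconds of raw video that fit in ram_bytes.

    bits_per_pixel=12 corresponds to YUV 4:2:0 sampling, which is an
    assumption made for this illustration only.
    """
    bytes_per_frame = width * height * bits_per_pixel / 8
    bytes_per_second = bytes_per_frame * fps
    return ram_bytes / bytes_per_second


ram = 64 * 1024 * 1024  # 64 Mbytes, assumed fully available for video
seconds = uncompressed_seconds(ram, 320, 240, 30)
print(f"{seconds:.1f} seconds of 320x240 @30fps video")
```

Under these assumptions the result is a little under 20 seconds, in line with the figure given above.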

Accordingly, some solutions have been attempted to correct the above 
problems, but with only limited success, if any. For example, the brute force 
approach to solve this problem is to put in a CPU or other electronic circuit that
encodes the video at high compression ratios in real time. However, this is an 
expensive solution in terms of the product cost and the battery life. 

An alternative would be to store uncompressed video and encode at a later 
time in non-real time. However, the amount of memory available would only allow a 
very limited video capture period (e.g., 20 seconds for 320x240 @30fps or 70
seconds at 1/4 of the preceding resolution).

Yet another alternative would be to greatly reduce the video frame rate or the 
video resolution. However, this degrades the video quality by a factor of at least 3-4,
which results in a less than pleasing video entertainment experience.

Accordingly, it would be desirable and highly advantageous to have a media
compression method and apparatus for mobile and other devices that overcomes the
above-identified problems of the prior art. 



SUMMARY OF THE INVENTION 

The problems stated above, as well as other related problems of the prior art,
are solved by the present invention, which is directed to a media compression
method and apparatus for mobile and other devices. The present invention solves
these problems by implementing a real time Low Complexity (LC) encoded bit stream
media compression step before a non-real time High Complexity (HC) encoded bit
stream media encoding step. Advantageously, the present invention provides power
and storage efficiency for the mobile and other devices. 

According to an aspect of the present invention, there is provided an 
apparatus for compressing media content in an electronic device having a video 
capture device for capturing the video content. The apparatus includes a real-time,
Low Complexity (LC) video compressor for compressing the video content into an LC 
encoded bit stream in real-time. The apparatus further includes a non-real-time High 
Complexity (HC) video compressor for generating an HC encoded bit stream from the 
LC encoded bit stream in non-real-time. 

According to another aspect of the present invention, there is provided a
method for compressing media content in an electronic device having a video capture
device for capturing the video content. The method includes the step of
compressing, in real-time, the video content into a Low Complexity (LC) encoded bit
stream. The method further includes the step of generating, in non-real-time, a High
Complexity (HC) encoded bit stream from the LC encoded bit stream.

These and other aspects, features and advantages of the present invention will
become apparent from the following detailed description of preferred embodiments, 
which is to be read in connection with the accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an apparatus 100 for compressing media 
in a mobile or other device, according to an illustrative embodiment of the present
invention;

FIG. 2 is a flow diagram illustrating a method of media compression for a
mobile or other device, according to an illustrative embodiment of the present 
invention; 

FIG. 3 is a diagram illustrating a Low Complexity (LC) encoded bit stream 310 
and a High Complexity (HC) encoded bit stream 320 for Intra frame re-use in HC
encoding, according to an illustrative embodiment of the present invention; and

FIG. 4 is a diagram illustrating a mobile device 400 in accordance with an 
illustrative embodiment of the present invention. 



DETAILED DESCRIPTION OF THE INVENTION

The present invention is directed to a media compression method and 
apparatus for mobile and other devices. The present invention provides power and 
storage efficiency for the mobile and other devices. The present invention may be 
implemented with respect to mobile devices including, but not limited to, cellular
telephones (hereinafter "cell phones"), Personal Digital Assistants (PDAs),
camcorders, digital cameras, and so forth. The present invention may also be
implemented with respect to non-mobile devices including, but not limited to, 
Personal Video Recorders (PVRs), and so forth. Moreover, the present invention 
may be implemented with respect to video and/or audio media. 

It is to be understood that the present invention may be implemented in
various forms of hardware, software, firmware, special purpose processors, or a
combination thereof. Preferably, the present invention is implemented as a
combination of hardware and software. Moreover, the software is preferably 
implemented as an application program tangibly embodied on a program storage
device. The application program may be uploaded to, and executed by, a machine
comprising any suitable architecture. Preferably, the machine is implemented on a
computer platform having hardware such as one or more central processing units 
(CPU), a random access memory (RAM), and input/output (I/O) interface(s). The
computer platform also includes an operating system and microinstruction code. The
various processes and functions described herein may either be part of the
microinstruction code or part of the application program (or a combination thereof) 
that is executed via the operating system. In addition, various other peripheral 
devices may be connected to the computer platform such as an additional data 
storage device and a printing device. 

It is to be further understood that, because some of the constituent system
components and method steps depicted in the accompanying Figures are preferably
implemented in software, the actual connections between the system components (or
the process steps) may differ depending upon the manner in which the present
invention is programmed. Given the teachings herein, one of ordinary skill in the
related art will be able to contemplate these and similar implementations or
configurations of the present invention. 

FIG. 1 is a block diagram illustrating an apparatus 100 for compressing media 
in a mobile or other device, according to an illustrative embodiment of the present 
invention. FIG. 2 is a flow diagram illustrating a method of media compression for a
mobile or other device, according to an illustrative embodiment of the present
invention. It is to be appreciated that the media may include video and/or audio 
content. 

The apparatus 100 includes a real-time media compressor 110, a memory 
device 120, and a non-real-time media compressor 130. The non-real-time media 
compressor 130 includes a Low Complexity (LC) decoder 132 and a High Complexity
(HC) encoder 134. The real-time media compressor 110 employs a low compression
ratio and low CPU complexity in compressing media in comparison to the non-real- 
time media compressor 130, which employs a high compression ratio and high CPU 
complexity. It is to be appreciated that in some embodiments of the present 
invention, the LC encoder 132 and the HC encoder 134 are implemented on a same
processor device. 

Media is captured by a capture device 199 (step 210). The media may include 
video and/or audio content. In the case of video content, the capture device 199 may 
be, e.g., a camera or image sensor together with an Analog-to-Digital Converter 
(ADC), or some other type of video capture device. In the case of audio content, the
capture device may be a microphone together with an ADC, or some other type of 
audio capture device. 

The uncompressed media is forwarded to the real-time media compressor 110 
and is compressed into a Low Complexity (LC) encoded bit stream by the real-time 
media compressor 110 (step 220). The real-time media compressor 110 can be
considered an intermediate encoder that operates in real-time and performs
compression on the incoming bit stream. The compression implemented by the real-
time media compressor 110 is preferably on the order of 20:1 or greater.

The LC encoded bit stream is forwarded to, and stored by, the memory device
120 (step 230). Preferably, the memory device 120 is a local memory device such as
a Random Access Memory (RAM), a memory storage card (e.g., FLASH or 
MICRODRIVE), etc. 

Depending on the CPU capability and architecture, the next step of high 
compression efficiency encoding can begin while the media is still being captured by 
the capture device 199 or when capturing is complete. An HC encoded bit stream is
generated from the LC encoded bit stream by the non-real-time media compressor
130 (step 240). Once the HC encoded bit stream is complete or while still being
encoded, the mobile or other device can send the stream to some other device 197
or to the network 198 (step 250), which may be a cellular or other type of network. Of
course, if the HC encoded bit stream is sent to the network, it is likely that the HC
encoded bit stream will be sent to some device 197 within the network 198.
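
By way of a non-limiting illustration, steps 210 through 250 can be sketched in software as follows. The sketch uses the general-purpose compressors `zlib` and `lzma` from the Python standard library purely as stand-ins for the LC and HC stages; they are not video codecs and are not part of the described apparatus. The function names and the synthetic capture data are likewise hypothetical.

```python
import lzma
import zlib


def lc_encode(chunk: bytes) -> bytes:
    # Stand-in for the real-time LC compressor 110 (fast, low effort setting).
    return zlib.compress(chunk, level=1)


def lc_decode(lc_chunk: bytes) -> bytes:
    # Stand-in for the LC decoder 132.
    return zlib.decompress(lc_chunk)


def hc_encode(chunk: bytes) -> bytes:
    # Stand-in for the HC encoder 134 (highest effort setting).
    return lzma.compress(chunk, preset=9)


def capture(n_chunks: int):
    # Hypothetical capture device: yields synthetic, compressible chunks.
    for i in range(n_chunks):
        yield bytes((i + j // 64) % 256 for j in range(4096))


# Steps 210-230: capture, LC-compress in real time, store in memory.
memory = [lc_encode(chunk) for chunk in capture(8)]

# Step 240: later, in non-real time, decode each LC chunk and HC-encode it,
# removing the LC version from memory as it is consumed.
hc_stream = []
while memory:
    lc_chunk = memory.pop(0)
    hc_stream.append(hc_encode(lc_decode(lc_chunk)))

# Step 250: hc_stream would now be sent to another device or to the network.
print(f"{len(hc_stream)} HC chunks ready, {len(memory)} LC chunks left in memory")
```

The point of the sketch is the ordering of the stages: only the inexpensive stage runs while media is being captured, and the expensive stage consumes the stored LC data afterwards.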

A description will now be given of methods of LC and HC video compression, 
according to an illustrative embodiment of the present invention. It is to be
appreciated that the present invention is not limited to the methods of LC and HC 
video compression described herein, and any other methods for LC and HC video 
compression may be utilized by the present invention while maintaining the spirit 
thereof. Moreover, as noted above and described in further detail herein below, the 
present invention may also be applied to audio media and is similarly not limited to 
the methods of LC and HC audio compression described herein, and any other 
methods for LC and HC audio compression may be utilized by the present invention
while maintaining the spirit thereof. 

The LC and HC formats can be defined by any given application. The goal is 
that the LC compression is relatively low in complexity compared to the HC 
compression, such that the LC compression can run in real-time on a large variety of 
CPUs for a given application such as, for example, a digital camcorder. At the same
time, the LC compression must achieve a high level of compression
(typically, the desired compression level is 20:1), such that a significant length of
content can be saved on a small storage device. Each application has its own
platform constraints of hardware and CPU capability and storage size availability.

For the HC format, typically the best compression possible should be
considered, as long as real-time decoders are available on the end device for
which the HC bitstream is targeted.

For HC generation, the Moving Picture Experts Group 4 (MPEG4)-part 10 (also
known as "Joint Video Team" (JVT) or H.264) encoding method is preferred.
MPEG4-part 10 currently has the highest encoding efficiency of any known method. 
MPEG4-part 10 is capable of 184:1 compression ratios (approximately 2-3 times as 
efficient as MPEG2). 

MPEG4-part 10 uses Intra (I), forward Predictive (P), and Bi-directionally 
predictive (B) frame types. Intra frames are the least efficient and P and B are much
more efficient. Thus, to reduce HC encoding time, it is preferable to use MPEG4-part
10 Intra frames for the LC compression. That is, the LC encoder produces MPEG4-
part 10 Intra frame only sequences at a compression efficiency ratio of approximately
20:1. Then, the HC encoder can re-use the Intra frames it needs and replace some
number (any number) of the other Intra frames. FIG. 3 is a diagram illustrating a
Low Complexity (LC) encoded bit stream 310 and a High Complexity (HC) encoded 
bit stream 320 for Intra frame re-use in HC encoding, according to an illustrative 
embodiment of the present invention. The LC encoded bit stream 310 includes only 
Intra (I) frame types, while the HC encoded bit stream 320 includes Intra, forward 
predictive (P), and bi-directionally predictive (B) frame types.

The HC encoder 134 would have to decode all LC Intra frames since 
uncompressed reference frames are used in encoding P and B frames. However, the 
extra step of encoding the Intra frames of the HC bit stream would not have to be 
done. 
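
By way of a non-limiting illustration, the Intra frame re-use described above can be sketched as a frame-type plan for the HC bit stream. The group-of-pictures length and the number of B frames between reference frames are illustrative assumptions that do not appear in the text, and the function name is hypothetical. The sketch works in display order and ignores coding order and end-of-stream handling.

```python
def plan_hc_frames(num_lc_frames, gop_length=12, b_between=2):
    """Return one (frame_type, reused) pair per LC Intra frame.

    gop_length and b_between are illustrative assumptions, not values
    taken from the text.
    """
    plan = []
    for n in range(num_lc_frames):
        pos = n % gop_length
        if pos == 0:
            plan.append(("I", True))    # LC Intra frame copied into the HC stream
        elif pos % (b_between + 1) == 0:
            plan.append(("P", False))   # LC frame decoded, then re-encoded as P
        else:
            plan.append(("B", False))   # LC frame decoded, then re-encoded as B
    return plan


plan = plan_hc_frames(24)
pattern = "".join(frame_type for frame_type, _ in plan)
reused = sum(1 for _, was_reused in plan if was_reused)
print(pattern)
print(f"{reused} of {len(plan)} Intra frames re-used without re-encoding")
```

In this sketch every LC frame is still decoded, since the decoded pictures serve as references, but the frames marked as re-used skip the HC Intra encoding step.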

As an example of the advantages of such a system, consider a PDA with
64MB of RAM. With a 20:1 compression ratio, approximately 26 minutes of LC
encoded bit stream could be stored (presuming 320x240 @30fps). A typical use in
the application is to store short video segments such as a video message that
includes the sender talking and/or the scenery in the local area. Given that an HC
encode might take 5X of real time, then the person could take 5 minutes of video and
then the HC encoded stream would be complete 25 minutes later. Of course, the 
user could also take 26 minutes of an LC encoded bit stream, but then would have to 
wait 2 hours before the HC encoded stream was complete. This model supports 
taking 26 minutes of video on average every 2 hours. Typical consumer usage of 
handheld video recorders involves taking no more than several minutes of content at 
a time. 
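
By way of a non-limiting illustration, the waiting times in this example follow directly from the 5X figure. The sketch below takes the 5X slowdown and the 26 minute capacity from the text as given; the function name is hypothetical.

```python
def hc_encode_minutes(captured_minutes, slowdown=5):
    """Minutes of HC encoding needed, given an encoder running at
    `slowdown` times real time (5X in the example above)."""
    return captured_minutes * slowdown


for minutes in (5, 26):
    wait = hc_encode_minutes(minutes)
    print(f"{minutes} min captured: {wait} min of HC encoding ({wait / 60:.1f} h)")
```

For 26 minutes of captured video this gives 130 minutes, which the example rounds to 2 hours.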

FIG. 4 is a diagram illustrating a mobile device 400 in accordance with an 
illustrative embodiment of the present invention. The mobile device 400 includes a 
memory bus 401, a Random Access Memory (RAM) 402, a camera sensor 404 
having a lens 403, an Analog-to-Digital Converter 406 (ADC), a CPU 408, a 
baseband modulation module 410, an audio Digital-to-Analog Converter (DAC) 412,
a graphics controller 414, a Radio Frequency (RF) transmitter 416, a 
speaker/headphone 418, a display 420 (e.g., a Liquid Crystal Display (LCD) or some 
other type of display), an antenna 460, a microphone 477, and an Analog-to-Digital 
Converter (ADC) 478. The mobile device 400 communicates with a cellular network
499. 

Video is captured from the camera sensor 404 (e.g., Charge Coupled Device
(CCD), Complementary Metal Oxide Semiconductor (CMOS), and so forth), digitized
and delivered to the CPU 408. The CPU 408 performs an LC compression operation
so as to LC compress the captured video in real time and place the LC encoded bit
stream in the RAM 402. When the CPU 408 has MIPS available for HC encoding,
then the CPU 408 can perform the HC compression and remove the LC encoded
stream from the RAM 402 to free memory space. This HC encoded stream can then
be sent through any network including low bandwidth networks such as cellular
network 499. 

In an alternative embodiment of the invention, a different LC compression 
could be used such as Motion JPEG (MJPEG), which is widely supported in mobile devices
and even in camera sensor Integrated Circuits (ICs) as a post process. In this way,
the CPU could be dedicated for HC compression since the MJPEG encoding is
external to the CPU. 

A brief description will now be given of some of the many advantages of the 
present invention. The present invention can be applied to any mobile device 
architecture capable of at least LC real time encoding, from the smallest cell phone
to the most advanced PDA. Moreover, HC real time encoding hardware is not
required, which saves on hardware costs in the device as well as power
usage. Further, the optimum use of the low bandwidth channel is achieved since HC
compression is the most efficient. Also, by using an intermediate LC compression,
20 times the amount of video can be captured by the consumer. This allows many
minutes of video instead of just a few seconds and meets the typical usage of a
camcorder consumer. 

A description will now be given of other applications and devices to which the 
present invention may be applied while maintaining the spirit of the present invention. 
Such applications include, but are not limited to, Personal Video Recorders (PVRs),
camcorders/digital cameras, and audio applications. It is to be appreciated that given
the teachings of the present invention provided herein, one of ordinary skill in the
related art will contemplate these and various other devices, applications, and 
implementations to which the present invention may be applied while maintaining the 
spirit of the present invention. 

With respect to PVRs, it is desirable for the content to be encoded in the most
efficient manner. However, the content must be captured in real-time for immediate
playback and simultaneous storage on the HDD (hard disk drive). An LC
compression can be used for this immediate real-time requirement and then, at a
later time, the LC encoded stream can be re-encoded (as described herein) with HC
non-real-time compression. This could take place whenever the PVR is not in active
use, or perhaps during the night time hours. 

The advantage for the PVR is that once an HC encoding is complete, then the 
LC encoded version can be removed and, due to the higher bit rate efficiency of the 
HC stream, more HDD space is then available.

With respect to camcorders and digital cameras, more content can be stored 
on such devices by using the HC non-real-time encoding after LC encoding in real 
time capture mode. Since camcorder use is generally in short bursts that last, on
average, up to 5 minutes, the LC to HC conversion could take place very easily. 

The advantage would be to have a lower complexity and lower cost camcorder 
with a higher capacity. In the case where the camcorder is connected to a network 
device of any kind, the HC compression allows the video signal to be distributed 
faster and with less bandwidth. Many camcorders use Digital Video (DV) 
compression, which is an LC type of Intra frame compression similar to MPEG2 Intra 
frames. This could still be used in DV camcorders for the LC format, but with the HC 
format being JVT or some other format such as, for example, MPEG2.

With respect to audio applications, the present invention may also be applied
thereto. For example, an audio recorder (e.g., in a camcorder, PDA, and so forth)
could use an LC encoding for real-time, and then an HC encoding for optimizing
storage and transmission. As an example, Moving Picture Experts Group Layer-3
Audio (MP3) could be used for LC encoding and MP3 Pro could be used for HC
encoding. 

Although the illustrative embodiments have been described herein with 
reference to the accompanying drawings, it is to be understood that the present 
invention is not limited to those precise embodiments, and that various other changes
and modifications may be effected therein by one of ordinary skill in the related art
without departing from the scope or spirit of the invention. All such changes and 
modifications are intended to be included within the scope of the invention as defined 
by the appended claims. 



